2025-01-02
Early development of computational text analysis in political science in the 2000s: convert text to numbers to perform statistical analysis
But it has really boomed over the last few years, with advances in AI making more complex analyses possible
Even more recent developments:
Use the different steps from Gilardi and Wuest 2018?
Transformation of text corpora into numbers to perform statistical analysis
For this we use text-as-data: we transform our documents into a numerical representation. This step is featurisation: representing a collection of documents in numerical form
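One simple way to featurise a collection of documents is a bag-of-words document-term matrix, where each row is a document and each column counts one term. The sketch below is only an illustration under that assumption: the helper names (`tokenize`, `document_term_matrix`) and the two-sentence toy corpus are hypothetical, and real pipelines usually add steps such as stop-word removal or weighting.

```python
from collections import Counter

PUNCTUATION = ".,;:!?()\"'"

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (tok.strip(PUNCTUATION) for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def document_term_matrix(docs):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix

# Hypothetical toy corpus, not taken from any real manifesto.
docs = [
    "We support workers and families.",
    "Lower taxes for families and businesses.",
]
vocab, dtm = document_term_matrix(docs)

for term, column in zip(vocab, zip(*dtm)):
    print(f"{term:12s} {column}")
```

Each document is now a vector of counts over a shared vocabulary, which is the kind of numerical form statistical models can work with. Word order is lost in this representation, which is the main trade-off of the bag-of-words choice.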
To learn from the data the function linking the text to the label, the machine needs numbers, not text, so we need to transform our text into a numerical representation
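To make "learning the function linking the text to the label" concrete, here is a minimal sketch using a perceptron on word counts. The choice of a perceptron is an assumption made for illustration, not the method used in these notes, and the four training texts and their labels (1 = labour-related, -1 = economy-related) are invented toy data.

```python
from collections import Counter, defaultdict

def featurise(text):
    """Turn a text into word counts (the numerical representation)."""
    return Counter(text.lower().split())

def score(features, weights, bias):
    """Linear score: bias plus the weighted sum of word counts."""
    return bias + sum(weights.get(t, 0.0) * n for t, n in features.items())

def train_perceptron(docs, labels, epochs=10):
    """Learn one weight per word; update only on misclassified documents."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for text, y in zip(docs, labels):
            x = featurise(text)
            pred = 1 if score(x, weights, bias) > 0 else -1
            if pred != y:
                for t, n in x.items():
                    weights[t] += y * n
                bias += y
                mistakes += 1
        if mistakes == 0:  # every training document is classified correctly
            break
    return weights, bias

def predict(text, weights, bias):
    """Return 1 or -1 for a new text."""
    return 1 if score(featurise(text), weights, bias) > 0 else -1

# Invented toy data: 1 = labour-related, -1 = economy-related.
train_docs = [
    "workers unions wages",
    "taxes business markets",
    "wages workers strike",
    "markets taxes profits",
]
train_labels = [1, -1, 1, -1]

weights, bias = train_perceptron(train_docs, train_labels)
print(predict("workers wages", weights, bias))
print(predict("taxes profits", weights, bias))
```

The model never sees the words themselves, only their counts, which is why the text has to be converted to numbers before any learning can happen.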
Learn feature representations of text: based on Natural Language Processing techniques, a whole scientific field designed to use computers to understand text
Social group detection in party manifestos